ABSTRACT


The coronavirus (CoV) N protein oligomerizes via its carboxyl terminus. However, the oligomeriza- 
tion mechanism of the C-terminal domains (CTD) of CoV N proteins remains unclear. Based on the 
protein disorder prediction system, a comprehensive series of HCoV-229E N protein mutants with 
truncated CTD was generated and systematically investigated by biophysical and biochemical anal- 
yses to clarify the role of the C-terminal tail of the HCoV-229E N protein in oligomerization. These 
results indicate that the last C-terminal tail plays an important role in dimer-dimer association. 
The C-terminal tail peptide interferes with the oligomerization of the CTD of the
HCoV-229E N protein and exerts an inhibitory effect on the viral titre of
HCoV-229E. This study may assist the development of anti-viral drugs against
HCoV.

Abbreviations: HCoV, human coronavirus; CoV, coronavirus; RNP,
ribonucleoprotein; N protein, nucleocapsid protein; S, spike; M, membrane; E,
envelope; Tm, melting temperature; SR-rich, serine-arginine-rich; IBV, avian
infectious bronchitis virus; SARS, severe acute respiratory syndrome; MHV,
murine hepatitis virus; SDS, sodium dodecyl sulfate

1. Introduction


Human coronavirus 229E (HCoV-229E), belonging to the alphacoronaviruses,
was first identified in the 1960s and has generally been associated with
symptoms of the common cold [1,2]. Although HCoV-229E infections are
generally mild, more severe upper and lower respiratory tract infections,
such as bronchiolitis and pneumonia, have been well documented,
particularly in infants, elderly individuals, and immunocompromised
patients [1,3,4]. There have also been reports that clusters of HCoV-229E
infections cause pneumonia in otherwise healthy adults [2,5]. Several
emerging human coronaviruses have recently been discovered
[6-8]. Between 2003 and 2004, the severe acute respiratory syn- 
drome (SARS)-CoV caused a worldwide epidemic and had a signif- 
icant economic impact in the countries where the disease outbreak 
occurred [8]. Phylogenetic analyses have shown that SARS-CoV is
closely related to the betacoronaviruses [9]. In
2004, another alphacoronavirus (HCoV-NL63) was isolated from a 
7-month-old child in the Netherlands suffering from bronchiolitis 
and conjunctivitis [7]. In 2005, Woo et al. described the discovery 
of a novel betacoronavirus, HKU1, which was found in patients 
with respiratory tract infections [10]. 

CoV particles are irregular in shape and have an outer envelope
with distinctive, ‘club-shaped’ peplomers, giving the virus a crown
(corona) appearance [11]. The coronavirus genome consists of
positive-sense, single-stranded RNA of approximately 30 kb and
contains several genes encoding structural and non-structural
proteins that are required for progeny virion production [1]. The
virion envelope surrounding the
nucleocapsid contains the following structural proteins: S (spike) 
protein, M (membrane), E (envelope), and N (nucleocapsid). Some 
variants have a third glycoprotein, HE (hemagglutinin-esterase), 
which is present in most betacoronaviruses [12,13]. A helical 
nucleocapsid exists in the center of the viral particle [14-16]. 
Nucleocapsid protein, the major structural protein of CoVs, binds 
the viral RNA genome to form the virion core, leading to the forma- 
tion of a ribonucleoprotein (RNP) complex or to a long helical 
nucleocapsid structure [17,18]. The formation of the RNP is impor- 
tant for maintaining the RNA in an ordered conformation suitable 
for replication and transcription of the viral genome [17,19-21]. 
Previous studies have shown that the CoV N protein is involved 
in the regulation of cellular processes, such as gene transcription, 
actin reorganization, host cell cycle progression, and apoptosis 
[22-25]. The CoV N protein has also been shown to act as an 
RNA chaperone [26]. Moreover, the N protein is an important diag- 
nostic marker and immunodominant antigen in host immune re- 
sponses [21,27,28]. 

The N protein of HCoV-229E, which has a molecular weight of 
50 kDa, is highly basic (pI, 10.0), and it shows strong hydrophilicity 
[29]. The HCoV-229E N protein has 26-30% sequence homology 
with CoV N proteins from other strains or viruses, such as HCoV- 
OC43 and SARS [30]. Despite their low sequence homology, CoV 
N proteins from different strains can show a high level of conserva- 
tion in some motifs [30]. Chang et al. reported results from an or- 
der-disorder prediction and secondary structure prediction 
coupled with sequence alignment, which suggested that all CoV N 
proteins share the same modular organization [31]. Self-association 
of the N protein is an important step in virus particle assembly for 
many CoVs [32]. Previous studies have shown that full-length CoV 
N proteins can form high-order oligomers, and the C-terminal do- 
mains of the CoV N proteins are responsible for oligomerization 
[30,33-38]. Crystal structures of the C-terminal domains of CoV N 
proteins have been published and suggest that the basic building 
block for coronavirus nucleocapsid formation is the dimeric assem- 
bly of the N protein [34,39-41]. Luo et al. revealed that the CoV N 
protein might combine with viral genomic RNA to generate high- 
er-order oligomers, which could trigger the formation of the long 
nucleocapsid structure [32]. However, the oligomerization mecha- 
nism of the C-terminal domain of the HCoV-229E N protein remains 
unclear. The C-terminal tail has been found to participate in the 
oligomerization of the SARS-CoV N protein since the removal of 
40 aa from the C-terminus apparently decreased the ability of the 
protein to oligomerize [32]. The extreme C-terminal tail of the 
HCoV N protein was labeled as a separate functional domain [42]. 
We speculate that the C-terminal tail might constitute an important
molecular determinant of oligomerization for the HCoV-229E N protein.
However, the recombinant full-length coronavirus N protein is so
sensitive to proteolysis and aggregation that its oligomerization
properties are difficult to analyze [21]. To clarify the role of the
C-terminal tail of the HCoV-229E N protein in oligomerization, a
comprehensive series of HCoV-229E N protein mutants with truncated
C-terminal domains was generated based on the PrDOS prediction (Fig. 1).
According to the order-disorder profiles obtained from the protein
disorder prediction system (PrDOS), the predicted structure of the
HCoV-229E N protein contains one long ordered region (N245-350) located
in the C-terminal domain, followed by three short regions predicted to
be disordered (N351-370), ordered (N371-382), and disordered (N383-389)
(Fig. 1) [30]. These truncations were systematically investigated by
various biophysical and biochemical analyses. Understanding this
mechanism would provide insight into the viral assembly process and
could identify additional targets for drugs to combat CoVs through
the disruption of N protein self-association.

Fig. 1. The order/disorder prediction for the full-length HCoV-229E N protein using
the PrDOS program, with the designations of the corresponding truncations, including
N245-350, N245-370, N245-376, N245-382 and N245-389. PrDOS scores above a
threshold value of 0.5 denote the disordered regions.
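
As an illustration of how a per-residue disorder profile maps onto the truncation boundaries, the following minimal sketch groups consecutive residues into ordered and disordered segments using the 0.5 threshold stated in the caption. The function name and the score values are hypothetical placeholders chosen to mimic the reported regions; they are not actual PrDOS output.

```python
from typing import List, Tuple

THRESHOLD = 0.5  # scores above this value are treated as disordered (Fig. 1)


def segment_by_disorder(scores: List[float], first_residue: int = 1,
                        threshold: float = THRESHOLD) -> List[Tuple[int, int, str]]:
    """Group consecutive residues into (start, end, label) segments."""
    segments: List[Tuple[int, int, str]] = []
    for offset, score in enumerate(scores):
        label = "disordered" if score > threshold else "ordered"
        residue = first_residue + offset
        if segments and segments[-1][2] == label:
            # Same label as the previous residue: extend the current segment
            start = segments[-1][0]
            segments[-1] = (start, residue, label)
        else:
            segments.append((residue, residue, label))
    return segments


# Hypothetical scores for residues 245-389 (placeholders, not PrDOS output)
toy_scores = [0.2] * 106 + [0.8] * 20 + [0.3] * 12 + [0.7] * 7

for start, end, label in segment_by_disorder(toy_scores, first_residue=245):
    print(f"N{start}-{end}: {label}")
```

With real PrDOS scores, the segment boundaries would suggest where truncation constructs such as N245-350 or N245-382 could end.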


2. Materials and methods 


The drugs and reagents were purchased from Sigma Chemical Co. All
oligonucleotides (oligoribonucleotides or oligodeoxyribonucleotides)
were synthesized using an automated DNA synthesizer and purified by
gel electrophoresis.


2.1. Expression and purification of the full-length and truncated N 
proteins 


The templates for the HCoV-229E N protein were kindly provided by
the Institute of Biological Chemistry, Academia Sinica (Taipei, Taiwan).
To generate the truncated forms of the recombinant N proteins, the N
protein gene was amplified by polymerase chain reaction (PCR) from the
plasmid pGENT using various primers. The PCR products were digested
with NdeI and XhoI, and the DNA fragments were cloned into pET21b
(Novagen) using T4 ligase (NEB). Protein expression was induced by
adding IPTG to 1 mM, followed by incubation at 37 °C for 6 h. After
harvesting the bacteria by centrifugation (6000 rpm, 30 min, 4 °C),
the bacterial cells were lysed with lysis buffer (50 mM Tris-HCl,
pH 7.3, 150 mM NaCl, and 15 mM imidazole). The lysate was clarified
by centrifugation (15,000 rpm, 30 min, 4 °C) to obtain soluble
proteins. The truncated C-terminal domains of the N protein with
a C-terminal His6-tag were purified using a Ni-NTA column
(Novagen) with an elution gradient from 15 to 250 mM imidazole
in the buffer solution. The fractions containing the target proteins
were collected and dialyzed against a low-salt buffer.


2.2. Circular dichroism (CD) spectroscopy 


The CD spectra were obtained using a JASCO-815 CD spectropolarimeter.
The temperature was controlled by circulating water at the desired
temperature in the cell jacket. Each protein was dissolved in 50 mM
Tris-HCl, pH 7.3, and 150 mM NaCl. The CD spectra were collected
between 250 and 190 nm with a 1 nm bandwidth at 1 nm intervals. All of
the spectra were obtained from an average of five scans. The
photomultiplier voltage did not exceed 600 V during the analysis. The
CD spectra were normalized by subtraction of a background scan with
buffer alone. The mean residue ellipticity, [θ], was calculated based
on the equation
$[\theta] = \mathrm{MRW} \times \theta_{\lambda} / (10 \times l \times c)$,
where MRW is the mean residue weight, θλ is the measured ellipticity in
millidegrees at wavelength λ, l is the cuvette path length (0.1 cm),
and c is the protein concentration in g/ml. The results were analyzed
using the CDSSTR program to calculate the percentage of each type of
secondary structure [43]. In addition, the Tm was determined from the
polynomial fitting of the observed curve and taken as the temperature
corresponding to half denaturation of the N protein. The first
derivative of the absorption with respect to temperature, dA/dT, of the
melting curve was computer generated and used to determine the Tm.
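
The two calculations described above can be sketched as follows. This is only an illustration: the function names and all numerical inputs are hypothetical, and the Tm estimate uses simple finite differences rather than the polynomial fitting used in this study. The ellipticity function applies the formula literally, so the units of ellipticity and concentration must be paired consistently (for example, millidegrees with mg/ml, or degrees with g/ml, to obtain deg cm² dmol⁻¹).

```python
import math
from typing import Sequence


def mean_residue_ellipticity(theta: float, mrw: float,
                             path_cm: float, conc: float) -> float:
    """[theta] = MRW * theta / (10 * l * c), applied as written in the text."""
    return mrw * theta / (10.0 * path_cm * conc)


def melting_temperature(temps: Sequence[float], signal: Sequence[float]) -> float:
    """Midpoint of the temperature interval with the steepest signal change.

    Finite-difference stand-in for the dA/dT analysis (an assumption,
    not the polynomial fit used in the study).
    """
    best_slope, best_temp = 0.0, temps[0]
    for i in range(len(temps) - 1):
        slope = abs((signal[i + 1] - signal[i]) / (temps[i + 1] - temps[i]))
        if slope > best_slope:
            best_slope = slope
            best_temp = 0.5 * (temps[i] + temps[i + 1])
    return best_temp


# Hypothetical inputs: -10 mdeg, MRW of 110 Da, 0.1 cm cell, 0.1 mg/ml protein
print(mean_residue_ellipticity(theta=-10.0, mrw=110.0, path_cm=0.1, conc=0.1))

# Hypothetical two-state melting curve centred at 37 degrees C
temps = [float(t) for t in range(20, 61, 2)]
signal = [-20.0 + 15.0 / (1.0 + math.exp(-(t - 37.0) / 2.0)) for t in temps]
print(melting_temperature(temps, signal))
```

The resolution of the Tm estimate is limited by the temperature step, which is one reason a smooth fit to the melting curve is usually preferred.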


2.3. Chemical crosslinking assay 


To investigate the oligomerization features of N proteins, a 
chemical crosslinking experiment was performed. A series of pro- 
tein solutions containing N proteins were supplemented with var- 
ious concentrations of glutaraldehyde, and the reaction mixture 
was incubated at room temperature for 5 min. The reaction was 
stopped by adding 1 M Tris-HCl at pH 7.3 (0.5%, v/v, final concen- 
tration) and placing it on ice. The sample was then analyzed by 
SDS-PAGE. 


2.4. Fluorescence spectroscopy


In the peptide-induced fluorescence quenching experiments, N protein
was added to a final concentration of 5 µM to a buffer that contained
various concentrations of peptide, and the samples were incubated at
25 °C for various durations. The buffer consisted of 50 mM Tris-HCl
(pH 7.5) and 150 mM NaCl. The tryptophan fluorescence was measured
using a Hitachi F-4500 fluorescence spectrophotometer equipped with a
cuvette with a 1 cm light path. The excitation wavelength was 288 nm,
and the emission data were collected between 300 and 400 nm. All static
measurements were recorded in triplicate. To determine the binding
constant between the peptide and the N proteins, the peptide-induced
fluorescence changes (ΔF) from three separate experiments at 1 h after
the addition of the test peptide were averaged and fit with the Hill
equation using the GraphPad Prism software program (San Diego, CA) as
follows: $\Delta F/\Delta F_{max} = 1/[1 + (K_d/X)^n]$, where ΔFmax is
the saturating value of the fluorescence change, X is the concentration
of the added peptide, Kd is the dissociation constant, and n is the
Hill coefficient [44].
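
To make the fitting step concrete, the sketch below evaluates the Hill equation and recovers its parameters from a titration by a coarse grid search. It is a simplified stand-in for the nonlinear regression performed in GraphPad Prism, not a reproduction of it; the function names, the grid ranges, and the synthetic data (generated from hypothetical parameters) are all assumptions for illustration.

```python
from typing import Sequence, Tuple


def hill_fraction(x: float, kd: float, n: float) -> float:
    """Fractional change dF/dFmax = 1 / (1 + (Kd / X)^n)."""
    if x <= 0.0:
        return 0.0
    return 1.0 / (1.0 + (kd / x) ** n)


def fit_hill(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float, float]:
    """Coarse grid-search least-squares fit; returns (dFmax, Kd, n)."""
    best = (float("inf"), 0.0, 0.0, 0.0)
    for i in range(496):                  # Kd grid: 0.5 to 50.0 in steps of 0.1
        kd = 0.5 + 0.1 * i
        for j in range(71):               # n grid: 0.5 to 4.0 in steps of 0.05
            n = 0.5 + 0.05 * j
            f = [hill_fraction(x, kd, n) for x in xs]
            # For fixed (Kd, n) the best amplitude follows from linear least squares
            dfmax = sum(y * fi for y, fi in zip(ys, f)) / sum(fi * fi for fi in f)
            sse = sum((y - dfmax * fi) ** 2 for y, fi in zip(ys, f))
            if sse < best[0]:
                best = (sse, dfmax, kd, n)
    return best[1], best[2], best[3]


# Synthetic titration generated from hypothetical parameters (not measured data)
peptide_um = [5.0, 10.0, 20.0, 50.0, 75.0, 100.0]
delta_f = [30.0 * hill_fraction(x, 7.5, 1.5) for x in peptide_um]

dfmax, kd, n = fit_hill(peptide_um, delta_f)
print(f"dFmax = {dfmax:.1f} AU, Kd = {kd:.1f} uM, n = {n:.2f}")
```

Because the amplitude ΔFmax enters the model linearly, it is solved analytically for each (Kd, n) pair, which keeps the search two-dimensional.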


2.5. Size-distribution analysis by analytical ultracentrifugation 


Sedimentation velocity experiments were performed using a Beckman
Optima XL-A analytical ultracentrifuge. The sample solutions (380 µl)
and the buffer solutions (400 µl) were loaded separately into the
double-sector centerpiece and assembled in a Beckman An-50 Ti rotor.
The experiments were performed at 20 °C with a rotor speed of
42,000 rpm. The protein samples were monitored by measuring the UV
absorbance at 280 nm in continuous mode with a time interval of 420 s
and a step size of 0.002 cm. Multiple scans at different time points
were fit to a continuous size distribution model using the program
SEDFIT [45] (Fig. S1). All of the size distributions were solved at a
confidence level of P = 0.95, a best-fit average anhydrous frictional
ratio (f/f0), and a resolution N of 250 sedimentation coefficients
between 0.1 and 20.0 S. To precisely determine the dimer-tetramer
dissociation constants (Kd,2-4) of the C-terminal domains of the
HCoV-229E N proteins in dimer-tetramer-oligomer equilibrium,
sedimentation velocity experiments were performed for three different
protein concentrations. The Kd,2-4 values were analyzed using the
SEDPHAT program with a monomer-m-mer-n-mer self-association model [46],
and the sedimentation velocity data collected for the three protein
concentrations were globally fit with SEDPHAT. The solvent density
and viscosity were calculated by the SEDNTERP program (Philo,
J. website http://www.jphilo.mailway.com/default.htm).
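
For orientation, the relationship between molecular mass, frictional ratio, and sedimentation coefficient can be sketched with the Svedberg equation. This is not the SEDFIT or SEDPHAT procedure; the partial specific volume, solvent density, viscosity, and frictional ratio below are assumed typical values, not parameters reported in this study.

```python
import math

AVOGADRO = 6.02214076e23  # 1/mol


def sedimentation_coefficient(mass_da: float, frictional_ratio: float,
                              vbar: float = 0.73,       # ml/g, assumed
                              density: float = 1.0,     # g/ml, assumed
                              viscosity: float = 0.01002  # poise, assumed (water, 20 C)
                              ) -> float:
    """Estimate s (in Svedberg units) from the Svedberg equation in cgs units."""
    # Radius of the anhydrous sphere with the same mass and partial specific volume
    r0 = (3.0 * mass_da * vbar / (4.0 * math.pi * AVOGADRO)) ** (1.0 / 3.0)  # cm
    f0 = 6.0 * math.pi * viscosity * r0    # Stokes friction of that sphere, g/s
    f = frictional_ratio * f0              # actual friction via f/f0
    s_seconds = mass_da * (1.0 - vbar * density) / (AVOGADRO * f)
    return s_seconds / 1e-13               # 1 S = 1e-13 s


# Hypothetical example: a 31 kDa species with an assumed f/f0 of 1.3
print(f"{sedimentation_coefficient(31000.0, 1.3):.2f} S")
```

At a fixed frictional ratio the estimate scales with the two-thirds power of the mass, which is why dimers, tetramers, and octamers appear as separate peaks in a sedimentation coefficient distribution.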


3. Results 


3.1. Oligomerization characterization of the HCoV-229E nucleocapsid 
protein C-terminus 


The C-terminal region of the HCoV N protein has been shown to
mediate the self-association of the protein. The self-association
mechanism of the full-length HCoV-229E N protein has been previously
reported [21,47]. To determine whether the C-terminal tail region
(N351-389) plays an important role in the oligomerization of the
C-terminal domain of the HCoV-229E N protein, the oligomerization of
several truncated forms of the C-terminal domain was analyzed using
analytical ultracentrifugation. The differences in the size
distributions among these truncated N proteins were analyzed by
sedimentation velocity experiments, and the dimer-tetramer
dissociation constant (Kd,2-4) for each was determined. Kd,2-4
reflects the affinity between two N protein dimers, with smaller
values indicating a higher tendency to oligomerize into tetramers.
The truncated N245-350 protein predominantly displays a dimeric
quaternary structure with a small amount of tetramers (Fig. 2A),
exhibiting a Kd,2-4 value of 256 µM. With an extended C-terminal
tail that includes residue 370 or residue 376, the respective
truncated N245-370 and N245-376 proteins also exist predominantly as
dimers in solution, exhibiting Kd,2-4 values of 177 and 159 µM,
respectively (Fig. 2B and C). With an extended C-terminal tail that
includes residue 382 or residue 389, the respective truncated
N245-382 and N245-389 proteins demonstrated significant shifts in
the equilibrium from dimers to tetramers and octamers, with
significant decreases in the Kd,2-4 values (3.83 and 3.50 µM,
respectively) (Fig. 3A and B). These results indicate that residues
377-389 at the end of the C terminus are necessary for the
oligomerization of the HCoV-229E N protein.
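
To illustrate what these dissociation constants imply, the sketch below computes the fraction of protein present as tetramers for a simple two-state dimer-tetramer equilibrium. It ignores octamers and higher oligomers, expresses the total concentration in dimer units, and uses a hypothetical total concentration of 15 µM, so it is a simplified model rather than the SEDPHAT analysis used here.

```python
import math


def tetramer_fraction(total_dimer_um: float, kd_um: float) -> float:
    """Fraction of protein in tetramers for 2 D <-> T with Kd = [D]^2 / [T]."""
    # Mass balance in dimer units: total = [D] + 2 [T] = [D] + 2 [D]^2 / Kd,
    # which is a quadratic in [D] with the positive root below.
    free_dimer = kd_um * (math.sqrt(1.0 + 8.0 * total_dimer_um / kd_um) - 1.0) / 4.0
    return 1.0 - free_dimer / total_dimer_um


# Dissociation constants reported in this section (micromolar)
reported_kd = {
    "N245-350": 256.0,
    "N245-370": 177.0,
    "N245-376": 159.0,
    "N245-382": 3.83,
    "N245-389": 3.50,
}

for construct, kd in reported_kd.items():
    fraction = tetramer_fraction(15.0, kd)  # 15 uM dimer is a hypothetical input
    print(f"{construct}: {100.0 * fraction:.0f}% in tetramers")
```

When the total dimer concentration equals the dissociation constant, exactly half of the protein is tetrameric in this model, which gives a quick way to read the values above.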


3.2. Conformational and stability studies of the HCoV-229E 
nucleocapsid protein C terminus 


The conformations of the truncated C-terminal domains of the
HCoV-229E N protein, including N245-350, N245-370, N245-376,
N245-382, and N245-389, were monitored using CD spectroscopy. As
shown in Fig. 4A, the CD spectra of these truncated C-terminal
domains were scanned from 190 to 250 nm at 25 °C. The CD spectrum of
N245-350 showed well-structured domains with α-helical and β-sheet
secondary structures as well as two negative peaks at approximately
205 and 220 nm. The CD spectra of N245-370, N245-376, N245-382, and
N245-389, which contain extended C-terminal tails, showed increased
intensities at approximately 205 and 220 nm, suggesting that they
possessed a different secondary structure composition compared to
N245-350. These CD spectra were further analyzed by the CDPro
software to determine the quantitative percentages of the secondary
structure (Table 1). N245-350 contains approximately 51% α-helices,
22% β-sheets, 12% turns, and 15% random coils at 25 °C. With an
extended C-terminal tail that includes residue 370, N245-370 showed
a significant increase in undefined structural content with 14%
turns and 17% random coils, consistent with the PrDOS prediction
that residues 351-370 are disordered. Compared to N245-350,
N245-370, and N245-376, N245-382 and N245-389 showed significant
increases in the β-sheet content, possibly because they contain an
ordered region in their C-terminal tail as predicted by PrDOS.

Fig. 2. The continuous sedimentation coefficient distributions of the different
HCoV-229E N protein truncations (A) N245-350, (B) N245-370, and (C) N245-376.
The protein concentrations were 50, 30 and 10 µM. The buffer consisted of 50 mM
Tris-HCl (pH 7.5), 150 mM NaCl and 0.1% β-mercaptoethanol. Dimer and tetramer
are denoted as D and T.

We also measured the melting temperatures (Tm) of the truncated
C-terminal domains of the HCoV-229E N protein, including N245-350,
N245-370, N245-376, N245-382, and N245-389, using CD in which the
absorbance at 220 nm was analyzed as a function of temperature
(Fig. 4B). The heat denaturation analysis showed that the Tm values
of N245-350, N245-370, and N245-376 were almost identical at 35.5,
36.2, and 36.7 °C, respectively. Interestingly, the Tm values of
N245-382 and N245-389, both above 70 °C, were higher than those of
N245-350, N245-370, and N245-376, indicating that oligomerization
contributes significantly to the stability of the C-terminal domain
of the N protein. The Tm values obtained from the CD studies were
further confirmed by tryptophan (Trp) fluorescence analyses
(Table S1).

Fig. 3. The continuous sedimentation coefficient distributions of the different
HCoV-229E N protein truncations (A) N245-382 and (B) N245-389. The protein
concentrations were 50, 30 and 10 µM. The buffer consisted of 50 mM Tris-HCl (pH
7.5), 150 mM NaCl and 0.1% β-mercaptoethanol. Dimer, tetramer, and octamer are
denoted as D, T, and O.


3.3. Interference of the oligomerization of the HCoV-229E nucleocapsid 
protein C terminus by a C-terminal tail peptide 


The results above indicated that a deletion mutant of the HCoV-229E
N protein C-terminal domain lacking the last 13 amino acids of the
C-terminal tail appeared incapable of a high degree of
oligomerization. To explore whether a C-terminal tail peptide
(residues 377-389) can compete for the oligomerization site of the
C-terminal domain of the HCoV-229E N protein and interfere with its
oligomerization, we synthesized a C-terminal tail peptide (N377-389)
and characterized the binding between N245-389 and this peptide.
First, we used fluorescence to monitor the protein/peptide binding
because N245-389 contains one tryptophan residue, which contributes
to its intrinsic fluorescence. The fluorescence emission spectrum of
N245-389 showed a maximal emission wavelength at approximately
332 nm with a fluorescence intensity of 142.4 AU (Fig. 5A). At
concentrations of 5, 10, 20, 50, 75 and 100 µM, the C-terminal tail
peptide decreased the fluorescence intensity of N245-389 at
approximately 332 nm by 8.7, 19, 23.5, 25.4, 26.9 and 28.9 AU,
respectively, 4 h after peptide addition. The N245-389 fluorescence
decreased with increasing concentrations of the C-terminal tail
peptide, which suggests that this decrease reflected the interaction
of N245-389 with the peptide. As shown in Fig. 5B, the fluorescence
quenching of N245-389 by the C-terminal tail peptide was analyzed
with a Hill plot after the addition of the peptide. The dissociation
constant of the C-terminal tail peptide for N245-389 was calculated
to be 7.43 µM.

Fig. 4. (A) The CD spectra of the HCoV-229E N protein truncations N245-350,
N245-370, N245-376, N245-382 and N245-389. The protein concentration was
5 µM, and the buffer consisted of 50 mM Tris-HCl (pH 7.5), 150 mM NaCl and 0.1%
β-mercaptoethanol. (B) The thermostability measurements of N245-350, N245-370,
N245-376, N245-382 and N245-389 monitored by CD at 220 nm. The protein
concentration was 7 µM, and the buffer consisted of 50 mM Tris-HCl (pH 7.5),
150 mM NaCl and 0.1% β-mercaptoethanol.


Table 1
The secondary structural content (%) of the truncated C-terminal domain of the
HCoV-229E N protein as determined by CD analysis.

Construct   Helix   Sheet   Turn   Disordered   NRMSD
N245-350    51      22      12     15           0.118
N245-370    46      23      14     17           0.080
N245-376    47      22      16     15           0.124
N245-382    43      25      19     13           0.089
N245-389    45      26      16     13           0.098

* This value was determined by the CDSSTR program.


Fig. 5. (A) The fluorescence spectra of N245-389 in Tris-HCl buffer with different
concentrations of N377-389 peptide. The protein concentration was 5 µM, and the
buffer consisted of 50 mM Tris-HCl (pH 7.3), 150 mM NaCl and 0.1%
β-mercaptoethanol. (B) Titration of the N245-389 protein with the N377-389
peptide in 50 mM Tris (pH 7.5), 150 mM NaCl and 0.1% β-mercaptoethanol. The
average of three experiments is shown. The data are expressed as a percentage of
the maximal fluorescence change as determined by a fit to the Hill equation.

To quantify the effect of the C-terminal tail peptide on the
dimer-dimer association of N245-389, the dimer-tetramer
dissociation constant (Kd,2-4) for the N245-389 protein was
determined in the absence and presence of the C-terminal tail
peptide. Sedimentation velocity (SV) experiments with increasing
concentrations of the C-terminal tail peptide were performed, and
the data were globally fit to determine the Kd,2-4 of N245-389.
Fig. 6 shows the distribution plots of the N245-389 protein in the
absence and presence of the C-terminal tail peptide. In the absence
of the peptide, N245-389 formed a stable dimer, tetramer, and
octamer with S-values of approximately 2.85, 5.30 and 7.75,
corresponding to molecular masses of 31, 69, and 131 kDa,
respectively. When the concentration of the C-terminal tail peptide
was increased, the N245-389 tetramer and octamer peaks decreased,
whereas the N245-389 dimer peak increased (Fig. 6). When the molar
ratio of N245-389/N377-389 was 0.25 and 0.1, the Kd,2-4 value for
N245-389 was approximately 6.1 and 9.7 µM, respectively, both
significantly higher than that of N245-389 in the absence of the
C-terminal tail peptide. These results indicate that the N245-389
tetramers and octamers were significantly dissociated into dimers in
the presence of the peptide (N377-389) and suggest that the
C-terminal tail peptide may compete for the tetramer interface of
N245-389 and interfere with the oligomerization of N245-389. We
further analyzed the effects of the peptide, N377-389, on the cell
viability and viral titre of HCoV-229E. The results showed that the
cell viability of A549 cells was not affected by treatment with
N377-389 alone (300 µM) for 24 h (Fig. S2A). In addition, the viral
titre of HCoV-229E was significantly inhibited by N377-389 at
300 µM (Fig. S2B).

Fig. 6. The continuous sedimentation coefficient distributions of the N245-382
protein in the presence of the N377-389 peptide. The protein concentration was
30 µM with two concentrations of the C-terminal tail peptide, N377-389, at 120
and 300 µM. The buffer consisted of 50 mM Tris-HCl (pH 7.5), 150 mM NaCl and
0.1% β-mercaptoethanol. Dimer, tetramer, and octamer are denoted as D, T, and O.


4. Discussion 


The C-terminal domains of the SARS-CoV and HCoV-OC43 N proteins
mediate the self-association of the protein to form high-order
oligomers. These oligomers exist predominantly as dimers [47,48].
The secondary structure alignment of the C-terminal domains from the
HCoV-229E N protein with the corresponding proteins from SARS-CoV
and IBV indicates that these proteins share very similar secondary
structure profiles [49]. The crystal structures of the C-terminal
domains of SARS-CoV, IBV, and MHV N proteins show a similar general
polypeptide fold, which strongly suggests that the dimerized N
protein is the functional unit in vivo for the four groups of
coronaviruses [35,39]. The crystal structure of the C-terminal
domain shows a tightly intertwined twofold symmetric C-terminal
domain dimer, with a β-hairpin (β1 and β2) from one subunit
extending into the cavity of the opposite subunit, which forms an
antiparallel β-sheet with hydrogen bonds occurring across the dimer
interface [39]. Chang et al. proposed that all coronaviruses employ
the same interface mechanism for the dimerization of the N protein
[48]. Based on the crystal structures of the N proteins from
SARS-CoV, MHV, and IBV, the dimeric C-terminal structural domain of
the HCoV-229E N protein has been mapped to N245-350. Here,
analytical ultracentrifugation analysis consistently shows that the
dimer appears to be the functional unit for the C-terminal domain of
the HCoV-229E N protein. The dimeric N245-350 self-associates into a
small amount of tetramers. A crosslinking assay was also conducted
to investigate oligomerization by the C-terminal domain of the N
protein. As shown in Fig. S3A, the crosslinking studies of N245-350
also detected dimers and tetramers.

Fig. 7. A schematic mechanism of tetramer formation by the N protein showing
that the N proteins form a tetramer through the interactions between the
C-terminal tails of the dimer.

The dimeric C-terminal structural domain of the HCoV-229E N
protein (N245-350) is capped by the C-terminal tail (N351-389).
In contrast to N245-350, the crosslinking results showed that
N245-389 forms dimers and tetramers as well as high-order oligomers
(Fig. S3B). The analytical ultracentrifugation results were
consistent with those of the chemical crosslinking analysis. Our
results demonstrate that the C-terminal tail plays a crucial role in
N protein oligomerization. According to the PrDOS prediction, the
C-terminal tail (N351-389) is composed of disordered (N351-370),
ordered (N371-382), and disordered (N383-389) regions. The
truncations of the C-terminal domain of the HCoV-229E N protein,
N245-370 and N245-376, which contain the first disordered region of
the C-terminal tail, display a predominantly dimeric quaternary
structure with a small amount of tetramers and high-order oligomers
and exhibit Kd,2-4 values of 177 and 159 µM, respectively.
Interestingly, when the dimeric C-terminal structural domain of the
HCoV-229E N protein includes the C-terminal tail to either residue
382 or 389, containing the short ordered region, the respective
proteins, N245-382 and N245-389, demonstrated equilibrium shifts
toward tetramers and octamers, indicating that the last 13 residues
of the C-terminal tail may play an important role in dimer-dimer
association. A computer-assisted prediction of the secondary
structure based on the amino acid sequence predicted a short
β-strand at the end of the C-terminal tail (Fig. S4), which is
consistent with the CD results showing that N245-382 and N245-389
contain higher β-sheet content compared to the other C-terminal
domain truncations of the HCoV-229E N protein. Therefore, we
speculate that the hydrogen bonds across the tetramer interface
formed by the main chain atoms of the short β-strand may stabilize
the oligomerization of the C-terminal domain of the N protein
through domain-swapping (Fig. 7).

Oligomerization usually occurs through interfacial interactions 
in which subunits cooperatively interact with each other in several 
ways, including domain swapping and coiled-coil interaction [50]. 
Previous studies have shown that oligomerization makes a crucial 
contribution to the stability of proteins [51]. Many archaeal pro- 
teins have homo-oligomeric structures, and some reports have 
shown a correlation between oligomerization and the hyperther- 
mostability of archaeal proteins [52]. In this report, N245-382 
and N245-389 show a very high degree of stability compared to 
the other C-terminal domain truncations of the HCoV-229E N pro- 
tein due to a high degree of subunit interactions. The inhibition of 
viral N protein oligomerization by developing competing peptides 
and small organic compounds is an attractive therapeutic strategy 
against viral infection [53-55]. We showed that a peptide based on
the C-terminal tail interfered with the oligomerization of the
C-terminal domain of the HCoV-229E N protein (N245-389) and exerted
an inhibitory effect on the viral titre of HCoV-229E. These
results suggest that small molecules or peptides could be designed
to target the oligomer interface as potential CoV inhibitors.

An amino acid sequence alignment of the C-terminal domains 
from the HCoV-229E N protein and corresponding proteins from 
SARS-CoV, IBV and HCoV-OC43 using the MultAlin program re- 
veals low sequence homology in the C-terminal tail (Fig. S5) [49]. 
However, they share similar order-disorder profiles in the C-termi- 
nal domain according to the PrDOS prediction (Fig. S6), suggesting 
that the oligomerization feature described above may be conserved 
across different groups of Coronaviridae. This study may assist the 
development of drugs to disrupt the oligomerization of the viral N 
protein and viral assembly. 